Abstract. School psychologists will likely become more involved in supporting 
the reading achievement of English language learners (ELLs). This requires 
evidence-based interventions that are validated for ELL students. Incremental 
rehearsal (IR) is an evidence-based intervention for teaching words, but the 
resource intensity often precludes its use. Using peers as interventionists may 
increase the contextual validity of IR while maintaining the benefits when 
compared with other drill techniques. This efficacy study examined if (a) peer-mediated 
IR (PMIR) was effective for teaching ELL students high-frequency 
words and (b) improvements in word reading generalized to changes in students’ 
oral reading fluency. Five ELL students participated in a randomized multiple-baseline 
design across participants. Results indicated that PMIR was functionally 
related to an increase in word reading for all 5 participants. Effect sizes estimated 
using Tau-U and multilevel modeling indicated that PMIR had a large effect on 
sight-word reading. No functional relationship between PMIR and oral reading 
fluency was observed. PMIR was generally acceptable to target students and peer 
tutors. Limitations and potential implications of the results are discussed. 


English language learners (ELLs), or 
youth who do not speak English at home and 
whose English proficiency limits their ability 
to access grade-level material, represent one 
of the fastest growing populations in the 
United States (Kena et al., 2014). Poor reading 
achievement has been a long-standing concern 
for ELL students (e.g., U.S. Department of 
Education, Institute of Education Sciences, 
National Center for Education Statistics, 
2015), suggesting that school psychologists 
will be increasingly involved with supporting 
ELL students’ reading achievement (August, 
McCardle, & Shanahan, 2014). This underscores 
the need for evidence-based interventions 
(EBIs) validated for ELL youth (Moore 
& Klingner, 2014). 

Despite the progress in the identification 
of EBIs over the past decade, data suggest the 
use of EBIs remains limited in schools (e.g., 
Crosse et al., 2011; Kretlow & Helf, 2013). 
Bridging the gap between research and practice 
requires efficacious interventions that are 
contextually valid. The amount of teacher time 
and number of resources needed to deliver 
EBIs are critical threats to contextual validity 
(Skinner, McCleary, Skolits, Poncy, & Cates, 
2013). Modifying existing EBIs is a promising 
approach to increase the contextual validity of 
these practices. The purpose of this study was 
to examine the efficacy of one such modification, 
using peer interventionists to implement 
incremental rehearsal (IR; Tucker, 1989). 

READING DEVELOPMENT AND 
INTERVENTION FOR ELL 
STUDENTS 

Reading is a complex process that incorporates 
multiple core skills—phonemic awareness, 
phonics, fluency, comprehension, and 
vocabulary (National Institute of Child Health 
and Human Development, 2000). ELL students 
will generally benefit from systematic, 
intensive instruction in these core areas 
(August et al., 2014; Goldenberg, 2010). Word 
reading, or the ability to recognize single 
words by sight, is a foundational literacy skill 
for native English speakers (Hudson, 
Torgesen, Lane, & Turner, 2012; Samuels & 
Flor, 1997) and ELL students (August & 
Shanahan, 2006). Word reading is a predictor of 
later reading fluency (Hudson et al., 2012; 
Speece & Ritchey, 2005), and word reading 
skills may mediate the relationship between 
phonological awareness and reading fluency 
(Yaghoub Zadeh, Farnia, & Geva, 2012). 
Early deficits in word reading skills predict 
later delays for ELL students, suggesting that 
word reading is an important intervention 
target (Lesaux, Rupp, & Siegel, 2007; Vaughn, 
Mathes, Linan-Thompson, & Francis, 2005). 

Intensive instruction for ELL students 
should be explicit and direct, maximize 
opportunities to respond, provide increased 
repetition of foundational skills, and provide 
immediate corrective feedback (Albers & Martinez, 
2015; August & Shanahan, 2006). School 
psychologists should search for interventions that 
incorporate these aspects when working with 
ELL students. One such intervention is IR 
(Tucker, 1989). 

IR involves the teaching of unknown 
items interspersed with a high percentage of 
known items. Interventionists provide direct 
and explicit teaching of unknown material 
when introducing unknown items. IR allows 
for substantial repetition and reinforcement of 
unknown and known vocabulary words. IR 
includes a large amount of scaffolding through 
the modeling of unknown items and immediate 
corrective feedback during practice. Moreover, 
interventionists can closely control the 
instructional level by adjusting the ratio of 
unknown to known words used per session 
and can adjust the amount of new information 
presented based on a student’s acquisition 
rate. Providing interventions at the appropriate 
instructional level and avoiding exceeding a 
student’s acquisition rate can lead to increased 
learning and retention of unknown material 
(Burns, 2007; Helman & Burns, 2008). 

EVIDENCE SUPPORTING THE USE 
OF IR FOR WORD READING 

There is relatively robust evidence supporting 
the use of IR for teaching unknown 
words. In a recent meta-analytic review, 
Burns, Zaslofsky, Kanive, and Parker (2012) 
found that IR was associated with a moderate 
effect on word reading across 12 studies using 
single-case or group designs, φ = .62, 95% 
confidence interval (CI) [.54, .70]. In addition, 
using IR to teach unknown words had a moderate 
effect on students’ oral reading fluency 
(ORF) in three studies, φ = .61, 95% CI [.36, 
.86]. Only one of the studies included in the 
review of Burns et al. (2012) targeted ELL 
students. Matchett and Burns (2009) found 
that using IR to teach words led to increases in 
word reading fluency and ORF for an ELL 
student. 

Further support for the use of IR with 
ELL students comes from three recent studies 
targeting letter-sound expression. Two studies 
found that traditional IR (i.e., using flashcards) 
was effective for teaching ELL students letter 
sounds (Peterson et al., 2014; Rahn et al., 
2015). In addition, a computerized version of 
IR was effective for teaching letter-sound expression 
to ELL students in kindergarten and 
first grade (DuBois, Volpe, & Hemphill, 2014). 

Intervention Efficiency 

Previous research has suggested that IR, 
and flashcard interventions more generally 
(Tan & Nicholson, 1997), is an effective practice 
for teaching words to native English 
speakers. Research demonstrating that IR was 
effective for improving letter-sound expression 
(DuBois et al., 2014; Peterson et al., 
2014; Rahn et al., 2015) and word reading 
(Matchett & Burns, 2009) lends support for 
using IR with ELL students. However, IR is 
relatively inefficient compared to other flashcard 
interventions, which is a critical consideration 
for practitioners. 

Procedures among flashcard interventions 
are varied. Traditional drill (TD) includes 
the rehearsal of a set of unknown items, 
whereas IR incorporates the practice of known 
items. Interspersing known items may double 
the time necessary for the intervention (Burns 
& Sterling-Turner, 2010) or reduce the number 
of items taught within a controlled period 
(Joseph, Eveleigh, Konrad, Neef, & Volpe, 
2012). 

When intervention duration is fixed, TD 
will result in more words being taught and 
consequently more words being initially retained. 
However, IR is generally associated 
with enhanced long-term maintenance and 
generalization of material (Burns & Sterling-Turner, 
2010; Joseph et al., 2012; Nist & Joseph, 
2008). This pattern is consistent with research 
on memory and learning that underscores 
the importance of spacing within the practice of 
unknown material (Varma & Schleisman, 2014). 
Practitioners should carefully balance findings 
related to initial retention, maintenance, and generalization 
with the time and resources necessary 
to provide either TD or IR. 

MODIFYING IR TO PROMOTE 
EFFICIENCY 

Given the need for efficient interventions 
that increase maintenance and generalization, 
researchers have recently examined 
the use of computerized IR to improve ELL 
students’ letter-sound expression and fluency. 
IR delivered via computer retained the interspersal 
of known items and high number of 
opportunities to respond but enhanced the efficiency 
of IR by (a) increasing the number of 
students who could receive the intervention in 
a given time frame, (b) decreasing the number 
of materials needed, and (c) automating data 
collection (DuBois et al., 2014; Volpe, Burns, 
DuBois, & Zaslofsky, 2011). Adult interventionists 
were still needed to provide immediate corrective 
feedback during intervention sessions. 

Another potential method for reducing 
the intervention intensity of IR is using peer-mediated 
instruction. Peer-mediated instruction 
includes strategies that use peers as teachers 
to provide individualized instruction, 
practice, and repetition (Utley & Mortweet, 
1997). For example, the well-researched Peer-Assisted 
Learning Strategies program trains 
peer tutors to model oral reading, provide support 
using comprehension strategies, and provide 
corrective feedback (Fuchs & Fuchs, 
2005). Peer-mediated instruction can be used 
to facilitate best practices in teaching ELL 
students reading, including providing additional 
opportunities to respond, increased 
practice opportunities, and immediate corrective 
feedback (Greenwood, Arreaga-Mayer, 
Utley, Gavin, & Terry, 2001). 

Several meta-analytic reviews have underscored 
the empirical support for using 
peers as teachers. For example, Hattie (2009) 
found that the average overall peer tutoring 
effect was moderate (d = 0.55) across 14 
meta-analyses that included 767 original studies. 
Similarly, Bowman-Perrott et al. (2013) 
synthesized single-case research on peer tutoring 
programs and found that peer tutoring had moderate 
to large effects on reading outcomes (e.g., 
sight-word acquisition, reading) across 10 studies. 
Specific to ELL students, a review by the 
Institute of Education Sciences found strong evidence 
for providing structured activities that 
pair students with different English proficiencies 
(Gersten et al., 2007). Peer-mediated IR 
(PMIR) could provide a structured format for 
pairs of ELL students to practice unknown 
material. Training peers to deliver IR would 
maintain key causal mechanism components 
of IR such as frequent opportunities to 
respond, immediate corrective feedback, 
and expanded practice of known material 
(Varma & Schleisman, 2014). PMIR does 
not reduce the materials necessary, but it 
could reduce the adult time necessary to 
implement the intervention. 

PURPOSE 

Although IR may enhance maintenance 
and generalization of unknown material, the 
resource intensity of the intervention greatly 
limits its contextual validity. Modifications 
to reduce the resource intensity of IR, such 
as using peer interventionists, may improve 
the efficiency of IR and its usefulness for 
applied settings. The purpose of this study 
was to investigate the efficacy of PMIR for 
teaching high-frequency words to ELL students. 
We had three major research questions: 

1. Is there a functional relationship between 
PMIR and an increase in the level 
of students’ word reading on previously 
unknown words? 

2. Is there a functional relationship between 
PMIR and an increase in the level 
of students’ ORF (i.e., would the effects 
of PMIR on word reading generalize to 
ORF)? 

3. To what extent are the words taught 
maintained after discontinuing PMIR? 

We hypothesized that PMIR would have 
a large, immediate effect on students’ word 
reading. On the basis of the link between word 
reading and ORF (Hudson et al., 2012; Speece 
& Ritchey, 2005) and a meta-analytic review 
of IR (Burns et al., 2012), we hypothesized 
that PMIR would have a small, delayed effect 
on students’ ORF level. Finally, we hypothesized 
that the effects of PMIR on word reading 
would be maintained. 

METHOD 

This study was conducted at a public 
charter school in Milwaukee, Wisconsin. The 
school served 432 students during the 2013-2014 
school year. Approximately 91% of 
these students were identified as Hispanic, 4% 
as Black, 4% as White, and 1% as other. 
Nearly all students (98%) were eligible for 
free or reduced-price lunch. Five percent of 
students received limited English proficiency 
services. Pseudonyms for participants are used 
throughout the article. The Institutional Review 
Board at the University of Wisconsin-Milwaukee 
approved the study prior to initiation 
of all research activities. 

Target Students 

Classroom teachers nominated second-grade 
(Iker, Sofia, and Tomas) and third-grade 
(Martin and Lucia) students who could benefit 
from additional reading support based on their 
reading performance in the classroom. All five 
nominees participated in the study, after we 
obtained parent consent and student assent. 
None of the participants received limited English 
proficiency services. Data regarding language 
proficiency were not available for these 
students. All students completed the winter 
administration of the Measures of Academic 
Progress (Northwest Evaluation Association, 
2009) 2 weeks prior to the study. Scores from 
the Measures of Academic Progress are presented 
in Rasch unit (RIT) values that are 
comparable across grade levels. 

Martin 

Martin was an 8-year-old, Hispanic boy 
in third grade. Martin had attended the school 
for the past 1.5 years and had moved to Wisconsin 
from Mexico the previous spring. His 
reading performance on the winter administration 
of the Measures of Academic Progress was 
at the first percentile nationally (RIT = 135). 

Lucia 

Lucia was an 8-year-old, Hispanic girl 
in third grade. Lucia was in her first year of 
attendance at the school. She was born in the 
United States and previously attended a public 
school that had a bilingual program. Lucia’s 
reading performance on the winter administration 
of the Measures of Academic Progress was 
at the 10th percentile nationally (RIT = 168). 

Iker 

Iker was a 7-year-old, Hispanic boy in 
second grade. He had attended the school 
for 1.5 years and moved to Wisconsin from 
Mexico in the spring of 2014. Iker’s reading 
performance on the winter administration of 
the Measures of Academic Progress was at the 
first percentile nationally (RIT = 138). Iker’s 
class received informal bilingual reading instruction 
from the school’s music teacher approximately 
once per week. 


Sofia 

Sofia was a 7-year-old, Hispanic girl in 
second grade. Sofia emigrated from Mexico, 
and this was her first year attending school in 
the United States. Her reading performance on 
the winter administration of the Measures of 
Academic Progress was at the first percentile 
nationally (RIT = 126). Sofia’s class received 
informal bilingual reading instruction from the 
school’s music teacher approximately once 
per week. 

Tomas 

Tomas was a 7-year-old, Hispanic boy 
in second grade. Tomas emigrated from Mexico, 
and this was his first year attending school 
in the United States. Tomas’ reading performance 
on the winter administration of the 
Measures of Academic Progress was at the 
first percentile nationally (RIT = 148). Tomas’ 
class received informal bilingual reading 
instruction from the school’s music teacher 
approximately once per week. 

Peer Tutors 

Teachers were also asked to nominate 
high-achieving students with well-developed 
English proficiency who they believed could 
effectively deliver PMIR under the supervision 
of an adult. Teachers identified four students 
in third grade who were asked to participate 
after receiving parent consent. Target 
students and peer tutors were not matched 
based on grade or gender. Target students 
worked with at least two tutors; however, we 
could not randomly assign tutors to target 
students because of classroom schedules. 

Rosa 

Rosa was an 8-year-old, Hispanic girl 
who had attended the school for 2 years. She 
reported that she primarily spoke Spanish at 
home. Rosa’s reading performance on the 
winter administration of the Measures of Academic 
Progress was at the 17th percentile 
nationally (RIT = 181). 

Maria 

Maria was an 8-year-old, Hispanic girl 
who had attended the school for 2 years. Maria 
reported that she primarily spoke Spanish at 
home. Her reading performance on the winter 
administration of the Measures of Academic 
Progress was at the 36th percentile nationally 
(RIT = 190). 

Carlos 

Carlos was an 8-year-old, Hispanic boy 
who had attended the school for 2 years; however, 
he moved back to Mexico for part of his 
first-grade year. Carlos reported speaking 
Spanish and some English at home. His reading 
performance on the winter administration 
of the Measures of Academic Progress was at 
the 79th percentile nationally (RIT = 208). 

Michael 

Michael was an 8-year-old, Hispanic 
boy who had attended the school for 2 years. 
Michael reported primarily speaking Spanish 
at home. Michael’s reading performance on 
the winter administration of the Measures of 
Academic Progress was at the 66th percentile 
nationally (RIT = 202). 

Measures 

Prior to the baseline phase, we assessed 
target students’ ability to read words from the 
Fry list of high-frequency words. A word was 
considered known if the participant correctly 
pronounced the word within 3 s. A word was 
considered unknown if the participant took 
longer than 3 s to read it correctly or read the 
word incorrectly. We created a set of 30 
known words and a set of 80 unknown words 
for each target student. We wrote each word 
on the front of 3 × 5-in. index cards using a 
black marker. The word was also written on 
the back using a red (unknown words) or 
green (known words) colored pencil to help 
tutors differentiate between known and unknown 
words during intervention sessions. 
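
The known/unknown rule described above can be written as a small decision function. This is only an illustrative sketch: the function name, the argument names, and the idea of recording a response latency in seconds are assumptions introduced here, not part of the study's materials.

```python
def classify_word(read_correctly: bool, latency_s: float, limit_s: float = 3.0) -> str:
    """Label a word 'known' only if it was read correctly within the time limit.

    Hypothetical helper: an incorrect reading, or a correct reading that
    takes longer than the limit, is labeled 'unknown'.
    """
    if read_correctly and latency_s <= limit_s:
        return "known"
    return "unknown"


# Example screening records: (word, read correctly?, latency in seconds)
responses = [("because", True, 1.4), ("thought", True, 4.2), ("enough", False, 2.0)]
labels = {word: classify_word(ok, secs) for word, ok, secs in responses}
```

Under this rule a slow but accurate reading is treated the same as an error, which mirrors the 3-s criterion stated in the text.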

Word Reading 

The primary dependent variable was 
word reading accuracy. We assessed students’ 
ability to read words individually by shuffling 
the selected index cards and presenting each 
word to the target student. The student was 
then asked to read each word. A word was 
considered correct if the participant read the 
word correctly within 3 s. A word was not 
considered correct if the participant took longer 
than 3 s to read the word correctly, read 
the word incorrectly, or did not provide a 
response. No corrective feedback was given 
during word reading assessments. Sessions 
were scheduled so that the number of days 
between each session ranged from 0 (i.e., conducted 
the next day) to 1. 

We assessed students’ ability to read 
three words that were randomly selected from 
their unknown set during each baseline and 
intervention session. Previous research found 
that the average acquisition rate of ELL students 
with very limited English proficiency 
(M = 3.2) was lower than the acquisition rate 
of students with moderate (M = 5.5) or well-developed 
English proficiency (M = 7; Burns 
& Helman, 2009). Without access to data regarding 
students’ English proficiency, and 
considering the length of time each participant 
had received English instruction, we conservatively 
chose to teach three words per session 
to avoid exceeding students’ acquisition rates. 
We randomly selected three words from the 
student’s set of unknown words prior to each 
intervention session. We assessed students’ 
ability to read the words that were taught the 
previous session, prior to teaching three new 
words. Words were only taught once during 
the intervention phase. 

In order to have comparable values 
across baseline and intervention phases, we 
assessed students’ ability to read three words 
during each baseline session. Prior to each 
baseline session, we randomly selected three 
words from the student’s set of unknown 
words. Words could only be selected once 
during the baseline phase (i.e., sampling without 
replacement). These baseline data allowed 
us to predict the extent to which students’ 
ability to read previously unknown words 
would improve prior to introduction of PMIR. 

Oral Reading Fluency 

In order to determine if improvements in 
word reading generalized to connected text 
reading, we used AIMSweb reading curriculum-based 
measures (CBMs) to assess students’ 
ORF. According to the technical manual 
(Pearson, 2012), resulting ORF data have 
sufficient alternate-form and test-retest reliability 
(r = .94). Evidence of validity was 
provided by moderate correlations between 
AIMSweb scores and state-test performance 
(r = .67-.70), as well as norm-referenced 
reading assessments (r = .67-.70). The variable 
of interest was the number of words read 
correctly per minute (WRCM). 

Graduate assistants administered one 
CBM probe at the beginning of each session, 
resulting in three ORF data points per week. 
Each target student was assessed using the 
progress monitoring probes for the grade in 
which he or she was enrolled. The 28 second-grade 
passages and 26 third-grade passages 
were administered in numerical order. Once 
participants had been assessed with the final 
grade-level passage, we randomly selected 
passages for the remaining sessions. 

Maintenance 

We assessed students’ ability to read all 
of the words taught during PMIR sessions 
approximately 10 days after the conclusion of 
the intervention phase. The number of words 
taught ranged from 54 to 75. We created a list 
of the words taught to each participant in 
Microsoft Word (black ink, 14-point font). 
Each participant was asked to read each word 
aloud. A word was considered learned if the 
participant read it aloud accurately within 3 s. 
A word was not considered learned if the 
participant took longer than 3 s to read the 
word correctly, read the word incorrectly, or 
did not provide a response. No corrective 
feedback was given during maintenance assessments. 
Students received a small tangible 
item (i.e., pencil or fruit snacks) and verbal 
praise after completing the assessment. 

Treatment Acceptability 

In order to examine treatment acceptability, 
target students were asked to rate if the 
intervention helped them learn more words on 
a scale of 1 to 10. To provide a visual representation 
of this scale, a horizontal line with 
numbers 1 through 10 was drawn on a whiteboard 
with three cartoon faces above it. The face 
above numbers 1 through 3 had a frown, the face 
above the numbers 4 through 7 had neutral affect, 
and the face above numbers 8 through 10 
had a smile. Target students were also told that 
1 meant not at all, 5 meant some, and 10 meant 
yes definitely. Finally, target students were asked 
open-ended questions regarding what they liked 
and disliked about PMIR and if they would want 
to use PMIR again to learn new words. 

Peer tutors were asked if PMIR was easy 
to administer (yes or no) and if they thought 
PMIR helped the target students learn new 
words, using the same 1 to 10 visual scale described 
above. Peer tutors were also asked open-ended 
questions about what they liked and disliked 
about PMIR and if they would want to use 
PMIR again to teach other classmates new 
words. 

Experimental Design 

We examined the efficacy of PMIR using 
a randomized, concurrent multiple-baseline 
design (MBD) across participants. We 
randomly assigned participants to the order in 
which the intervention was introduced (i.e., 
panels in Figures 1 and 2). We also randomly 
assigned the intervention start points (with a 
minimum two-session lag) using the random 
number function in Microsoft Excel. Using 
randomization to determine intervention start 
points, instead of the traditional response-guided 
approach, can reduce threats to internal 
validity such as experimenter bias (Kratochwill 
& Levin, 2010). The use of randomization 
limits the flexibility found in response-guided 
approaches (e.g., determining when to implement 
the intervention), but because this was an 
efficacy trial of PMIR, we sought to maximize 
the internal validity of the findings. 
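
The randomization logic can be illustrated with a short sketch. The study used the random number function in Microsoft Excel; the Python code below is a stand-in for illustration only. The function names, the window for the first start point, and the maximum lag are hypothetical values chosen for the example; only the minimum two-session lag comes from the text.

```python
import random
from typing import List, Optional, Tuple


def assign_order(participants: List[str], rng: random.Random) -> List[str]:
    """Randomly order participants (i.e., which panel each case occupies)."""
    order = list(participants)
    rng.shuffle(order)
    return order


def staggered_starts(
    n_cases: int,
    first_range: Tuple[int, int] = (5, 6),  # hypothetical window for the first start
    lag_range: Tuple[int, int] = (2, 3),    # minimum lag of two sessions (from the text)
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Draw intervention start sessions that are staggered across cases."""
    rng = rng or random.Random()
    start = rng.randint(*first_range)
    starts = [start]
    for _ in range(n_cases - 1):
        start += rng.randint(*lag_range)  # each later case starts at least 2 sessions later
        starts.append(start)
    return starts


rng = random.Random(2014)  # fixed seed so the example is reproducible
order = assign_order(["Case A", "Case B", "Case C", "Case D", "Case E"], rng)
starts = staggered_starts(len(order), rng=rng)
schedule = dict(zip(order, starts))
```

Drawing both the order and the start points before data collection is what separates this approach from a response-guided design, in which the start of intervention depends on the observed baseline data.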

Procedures 

The Institutional Review Board at the 
University of Wisconsin-Milwaukee approved 
this study prior to initiation. Two research assistants 
administered all assessments and conducted 
interventionist training after receiving 
training from the first author. Both students 
were completing their second semester in a 
school psychology program. The first graduate 
student was a White man and a native English 
speaker. The second graduate student was a 
Black woman who spoke Shona and English 
fluently. Assistants were required to demonstrate 
100% fidelity and 90% agreement on the ORF 
measures prior to study initiation. All sessions 
took place in an unused classroom. 

Figure 1. Number of Words Read for Target Students 

Note. During both phases, words were randomly selected from an unknown set of words identified during the prebaseline 
assessment. During baseline, students were assessed using three randomly selected words. During intervention sessions, 
students were assessed on words taught during the previous session. 

Tutor Training 

Graduate assistants trained each peer tutor 
during three to four 20-minute sessions. 
Training included six major components. 
First, graduate assistants modeled the intervention 
for the peer interventionist by delivering 
the intervention to the student. Second, 
interventionists were trained to introduce unknown 
words (i.e., “This word is cat. What 
word is this? Good, this word is cat.”) and 
provide standard corrective feedback (i.e., 
“This word is dog. What word? Good, this 
word is dog.”). Peer tutors had an opportunity 
to practice both of these components separately. 
Third, tutors were trained to present 
unknown words interspersed with known 
words. Fourth, tutors were trained to remove 
the last known word from the deck after they 
presented all seven known words and then to 
introduce the second unknown word. Fifth, 
tutors were given opportunities to practice administering 
the procedure to one of the graduate 
assistants while the other provided immediate 
corrective feedback. Sixth, tutors were 
given a script with blanks for each of the 
unknown words that were to be taught. They 
were also given a 3 × 5-in. index card with the 
corrective feedback procedure stated clearly. 
Tutors were able to use these materials during all 
intervention sessions. Tutors were considered 
sufficiently trained when they administered IR to 
a graduate assistant with 100% integrity. 

Baseline 

Graduate assistants collected ORF data 
using one probe at the start of each session, 
resulting in a maximum of three data points 
per week. Next, word reading was assessed 
using three randomly selected unknown words. 
Baseline sessions took approximately three minutes 
per student. The number of baseline sessions 
ranged from 5 to 14 (M = 9.4, Mdn = 9). 

Figure 2. ORF During Baseline and Intervention Phases 

Note. Data were collected three times per week. During each assessment, students read one AIMSweb ORF progress 
monitoring probe under standardized conditions. ORF probes were at the student’s grade level. ORF = oral reading 
fluency. 

Intervention 

Prior to each intervention session, graduate 
assistants randomly selected three unknown 
and seven known words. Each unknown 
word was rehearsed with seven known 
words to maintain an appropriate ratio for drill 
tasks. Prior to the intervention session, graduate 
assistants asked peer tutors to read each 
word and prompted them to write the unknown 
words on a script. At the start of each session, 
graduate students collected ORF and word reading 
data while the tutors organized the known 
and unknown items in the correct order. 

Peer tutors administered IR following 
standardized procedures. In brief, tutors began 
each session by introducing IR and presenting 
the first unknown word. The target student 
was then asked to read the word aloud with the 
tutor providing immediate error correction if 
the target student responded incorrectly. The 
tutor presented the unknown (U) word and a 
sequence of known (K) words (i.e., U1, K1; 
U1, K1, K2; U1, K1, K2, K3; U1, K1, K2, K3, 
K4; U1, K1, K2, K3, K4, K5; U1, K1, K2, K3, 
K4, K5, K6; U1, K1, K2, K3, K4, K5, K6, 
K7). The tutor provided standard error correction 
any time an error was made. If an error 
was made on a known word, the tutor started 
again with U1. For example, if the target student 
made a mistake on K5, the tutor provided 
standard error correction and then started 
again with the sequence U1, K1, K2, K3, K4, 
K5 before continuing on. Once the unknown 
word was presented with all known words, the 
tutor removed K7 from the deck and introduced 
a second unknown item. This process 
was completed until three unknown words 
were taught. Intervention sessions took approximately 
ten minutes per student. The 
number of intervention sessions ranged from 8 
to 25 (M = 19, Mdn = 21). The fifth target 
student to start the intervention moved to another 
state after 9 weeks. 
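
The presentation sequence described above can be generated programmatically, which may help when preparing tutor scripts. The following is a minimal sketch, not the study's materials: the function names are hypothetical, the restart-after-error rule is omitted for brevity, and it assumes (as in common descriptions of IR) that a newly taught word moves to the front of the known deck when the last known word is removed. The text states only that K7 was removed and that each unknown word was rehearsed with seven known words.

```python
from typing import List, Tuple


def ir_presentations(unknown: str, knowns: List[str]) -> List[str]:
    """Card order for one unknown word: U, K1; U, K1, K2; ... up to all knowns."""
    order: List[str] = []
    for i in range(1, len(knowns) + 1):
        order.append(unknown)     # re-present the unknown word
        order.extend(knowns[:i])  # then the first i known words
    return order


def ir_session(unknowns: List[str], knowns: List[str]) -> List[Tuple[str, List[str]]]:
    """Plan a session that teaches several unknown words with a fixed-size known deck."""
    deck = list(knowns)
    plan = []
    for word in unknowns:
        plan.append((word, ir_presentations(word, deck)))
        # Assumption: the word just taught joins the front of the deck and the
        # last known word (K7 in the first round) is removed, keeping seven knowns.
        deck = [word] + deck[:-1]
    return plan


knowns = [f"K{i}" for i in range(1, 8)]
plan = ir_session(["U1", "U2", "U3"], knowns)
first_word, first_order = plan[0]
print(first_order[:5])   # ['U1', 'K1', 'U1', 'K1', 'K2']
print(len(first_order))  # 35
```

With seven known words, a single error-free pass presents the unknown word 7 times and known words 28 times, for 35 card presentations per unknown word.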

Treatment Integrity and Interrater 
Agreement 

We created a fidelity checklist with 
eight discrete tasks necessary for the administration 
of IR. The steps included briefly introducing 
the task, correctly introducing the 
unknown words, correctly completing the interspersal 
task, correctly adding another unknown 
word and removing a known word, and 
correctly using the standard error correction 
procedure. Graduate assistants observed 38% 
of the intervention sessions and recorded if 
each intervention step was completed correctly. 
Treatment integrity was calculated by 
dividing the number of correctly completed 
steps by the total number of steps and multiplying 
the result by 100. Average integrity 
was 97.4% (range = 75% to 100%). Graduate 
assistants provided corrective feedback after 
any session in which a step was conducted 
incorrectly. 

We collected assessment fidelity and interrater 
agreement data during 32% of baseline 
sessions (n = 15) and 34% of intervention 
sessions (n = 32). We created an assessment 
fidelity checklist for assessing target students’ 
word reading. The checklist included three 
steps: correctly presenting and prompting students 
to read the unknown word, correctly 
scoring words as known or unknown, and accurate 
recording of student performance. 
Word reading assessment fidelity was 100%. 
We used the AIMSweb Accuracy of Implementation 
rating scale to measure ORF fidelity. The 
measure includes 14 steps. Assessment fidelity 
for ORF averaged 99.5% (range = 92% to 
100%). 

We calculated interrater agreement by
dividing the number of agreements by the total
number of agreements and disagreements. For
word reading, interrater agreement was 100%
during baseline and 97% during intervention.
Average interrater agreement for ORF data
was 97.2% (SD = 2.2%) during baseline
and 95.7% (SD = 3.4%) during intervention.
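
As a small illustration of the agreement index, the sketch below compares two raters' item-by-item scores. The function name and the scores are hypothetical, and expressing the ratio as a percentage is an assumption based on how the results are reported.

```python
def interrater_agreement(rater_a, rater_b) -> float:
    """Agreements divided by agreements plus disagreements, as a percentage."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same, nonempty set of items")
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    # Every scored item is either an agreement or a disagreement.
    return 100 * agreements / len(rater_a)


# Hypothetical scoring of four words (1 = read correctly, 0 = not).
print(interrater_agreement([1, 1, 0, 1], [1, 0, 0, 1]))  # 75.0
```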

Data Analyses 

Data analyses followed What Works 
Clearinghouse guidelines for evaluating inter¬ 
vention effects in single-case designs 
(Kratochwill et al., 2013). Visual analyses 
were used to determine if there was a func¬ 
tional relationship between PMIR and each 
dependent variable (i.e., word reading and 
ORF). If a functional effect was present, we 
proceeded to estimate the effect sizes using 
quantitative methods. 

Visual Analyses 

Visual analyses were conducted for each 
participant and outcome variable. First, we 
examined the level, trend, and stability of data 
within each phase. We estimated within-phase 
stability by calculating the percentage of data 
points within 20% of the phase median. Sec¬ 
ond, we examined the immediacy of effect, 
consistency of data patterns, and overlap of 
data between baseline and PMIR phases. We 
examined between-phase changes in level by 
evaluating the relative, absolute, median, and 
mean level change (Lane & Gast, 2014). Three 
demonstrations of an intervention effect are 
necessary for establishing a functional rela¬ 
tionship (Kratochwill et al., 2013). 
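
The within-phase stability estimate can be sketched as below. The function name and the data are hypothetical, and treating the band as inclusive (points exactly 20% from the median count as within it) is an assumption.

```python
from statistics import median


def percent_within_band(phase_data, band=0.20) -> float:
    """Percentage of data points within +/- band of the phase median."""
    med = median(phase_data)
    tolerance = band * abs(med)
    within = sum(1 for x in phase_data if abs(x - med) <= tolerance)
    return 100 * within / len(phase_data)


# Hypothetical baseline phase: words read correctly out of 3 per session.
baseline = [0, 0, 0, 1, 0]
print(percent_within_band(baseline))  # 80.0
```

When the phase median is 0, the band has zero width, so only points equal to the median count as stable. This is why a single nonzero baseline point lowers the stability percentage.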

Across-Case Effect Sizes

If visual analysis indicated a functional
relationship, we proceeded to estimate across-case
effect sizes using Tau-U and multilevel
modeling. Tau-U provides a nonparametric index
of the percentage of data that do not
overlap minus the percentage of data that
overlap between baseline and control phases
(Parker, Vannest, & Davis, 2014). Tau-U is
based on Kendall’s rank correlation and the
Mann-Whitney test for two groups, both of
which follow the S distribution. Tau-U improved
upon previous indexes of nonoverlap
by following a known distribution and allowing
for the control of trends in baseline data
(Parker et al., 2014). Parker and Vannest
(2009) provided tentative guidelines for interpreting
the nonoverlap of all pairs statistic,
which can be transformed into Tau-U. Effect
sizes from 0 to 0.31 are small; from 0.32
to 0.84, medium; and from 0.85 to 1.0, large
(Parker & Vannest, 2009).
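
To make the nonoverlap logic concrete, the sketch below computes the nonoverlap of all pairs (NAP) and the basic Tau for one baseline-versus-intervention contrast. It is a simplified illustration with hypothetical data and function names. It omits the baseline-trend correction and the across-case weighting, so it should not be read as the procedure of the online calculator.

```python
def nap_and_tau(baseline, intervention):
    """Compare every baseline point with every intervention point."""
    pos = neg = ties = 0
    for a in baseline:
        for b in intervention:
            if b > a:
                pos += 1      # improvement (nonoverlapping pair)
            elif b < a:
                neg += 1      # deterioration (overlapping pair)
            else:
                ties += 1
    pairs = len(baseline) * len(intervention)
    nap = (pos + 0.5 * ties) / pairs   # ties count as half
    tau = (pos - neg) / pairs          # algebraically equal to 2 * NAP - 1
    return nap, tau


def interpret(tau):
    """Tentative labels based on the guidelines cited in the text."""
    if tau < 0.32:
        return "small"
    if tau < 0.85:
        return "medium"
    return "large"


# Hypothetical words read correctly (out of 3) in each session.
nap, tau = nap_and_tau([0, 0, 1, 0], [2, 3, 1, 2, 3])
print(nap, tau, interpret(tau))
```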

We calculated Tau-U for each target student
and a weighted, across-case Tau-U using
an online calculator (Vannest, Parker, &
Gonen, 2011). In brief, we assessed the trends
in the baseline data for each student. Significant
trends in the baseline data (p < .15) were
controlled for prior to calculating the phase
contrast between baseline and PMIR phases
for each participant. Finally, we used the
calculator to estimate the weighted average Tau-U
across all participants.

We also used multilevel modeling to 
estimate the effects of PMIR across cases. In 
single-case designs, data are hierarchically 
structured: Measurements are nested within 
participants, and participants in turn are nested 
within studies. Ignoring the multilayered na¬ 
ture of MBD data can adversely impact the 
conclusions of a single-case analysis (Van den 
Noortgate, Opdenakker, & Onghena, 2005) as 
standard error estimates will be too small, 
resulting in an inflated number of Type I errors 
when used in statistical tests. Multilevel mod¬ 
eling is a recommended approach to analyze 
data characterized by a two-level data struc¬ 
ture (Ferron, Bell, Hess, Rendina-Gobioff, & 
Hibbard, 2009). After checking the tenability 
of regression assumptions, we used the multi¬ 
level model suggested by Van den Noortgate 
and Onghena (2003, 2008) to estimate (a) the 
overall average effect of PMIR across participants
and (b) the within- and between-participant
variability. Using an MBD across 5 participants,
with 30 measurement occasions,
provided sufficient power (> .80) to detect
large treatment effects (Ferron, Moeyaert, Van
den Noortgate, & Beretvas, 2014). We conducted
all statistical analyses using SAS Version
9.4 (SAS Institute, 2011-2014) and used
a significance level of .05 for all statistical 
tests. 


RESULTS 

In the following sections, we describe 
the visual analysis results for each student. 
Then, when visual analysis indicated that a 
functional relationship exists, we discuss the 
across-case effect sizes.

Question 1: Effect of PMIR on Word
Reading

We hypothesized that the implementa¬ 
tion of PMIR would be functionally related to 
an increase in the level of students’ word 
reading. Results are shown in Figure 1. 

Martin 

As shown in the first panel of Figure 1, 
Martin’s performance during baseline was low 
(M = 0.2, Mdn = 0) and stable (i.e., 80% of
data within ± 20% of phase median). There was 
no apparent trend in baseline performance. Mar¬ 
tin’s performance during PMIR increased in 
level (M = 2.24, Mdn = 2). His performance 
was highly variable (i.e., 28% of data points 
within ± 20% of phase median), and the relative 
level of Martin’s performance decreased slightly 
across time. The observed between-phase 
change in performance was immediate and con¬ 
sistent (e.g., median level change = 2). There 
was very little overlap of the data between the 
two phases. Despite the increased variability of 
performance during PMIR, results indicated that 
PMIR had an effect on word reading for Martin. 

Lucia 

Lucia’s performance during baseline 
was low (M = 0.43, Mdn = 0) and relatively 
stable (i.e., 71% of data within ± 20% of 
phase median). There was an abrupt increase 
in her performance during the middle of the 
baseline phase followed by a negative trend. 
During PMIR, Lucia’s performance increased in 
level (M = 2.56, Mdn = 3) and variability (i.e.,
65% of data within ± 20% of phase median). 
Her performance during PMIR decreased 
slightly between Sessions 9 and 13 but increased 
between Sessions 19 and 23. There was an im¬ 
mediate and consistent change in Lucia’s perfor¬ 
mance between phases. There was one overlap¬ 
ping data point between the two phases. Results 
indicated that PMIR had an effect on Lucia’s
word reading despite the change in trend during 
the PMIR phase. 

Iker 

Iker’s performance during baseline was 
low (M = 0.11, Mdn = 0) and stable (i.e., 
89% of data within ± 20% of phase median). 
No trends in his performance were evident 
during baseline. During PMIR, Iker’s performance
increased (M = 2.29, Mdn = 2) and
became more variable (i.e., 43% of data 
within ± 20% of phase median). No trends 
were apparent during PMIR. Iker’s perfor¬ 
mance between phases increased immediately 
and was consistent (e.g., median level 
change = 2). There was no overlap between 
the two phases. PMIR had an effect on Iker’s 
word reading. 

Sofia 

Sofia’s performance during baseline was 
low (M = 0.08, Mdn = 0) and stable (i.e.,
92% of data within ± 20% of phase median). 
Her performance during PMIR increased 
(M = 2.39, Mdn = 2.5) and was generally 
consistent (i.e., 89% of data within ± 20% 
of phase median). There was no trend ap¬ 
parent in her performance during either the 
baseline phase or the PMIR phase. Between- 
phase change in Sofia’s performance was 
immediate and consistent (e.g., median level 
change = 2.5). There was no overlap in the 
data between phases. PMIR had an effect on 
Sofia’s word reading. 

Tomas 

Tomas demonstrated low performance 
during baseline (M = 0.64, Mdn = 0); how¬ 
ever, his performance was relatively variable 
(i.e., 57% of data within ± 20% of phase 
median). Tomas’ performance improved dur¬ 
ing the baseline phase. During PMIR, Tomas’ 
performance increased in level (M = 2.28, 
Mdn = 3) and consistency (i.e., 87.5% of data 
within ± 20% of phase median). No trends 
emerged during the PMIR phase. Although the 
observed change in performance was consis¬ 
tent (e.g., median level change = 3), the im¬ 
mediacy of the effect was unclear because of 
the improvement during Baseline Sessions 9
and 13. Two of the data points overlapped
between phases. Despite the change in perfor¬ 
mance during baseline, the relative consis¬ 
tency in performance during intervention in¬ 
dicated that PMIR had an effect on Tomas’ 
word reading. 

Similar patterns occurred for all 5 par¬ 
ticipants with one caveat. Tomas’ perfor¬ 
mance increased when PMIR was introduced 
for Iker; however, the increase was brief and 
his performance decelerated before PMIR 
was initiated. Moreover, Tomas’ perfor¬ 
mance immediately improved during PMIR, 
and this improvement was consistent. Visual 
analyses indicated that there was a func¬ 
tional relationship between PMIR and sight- 
word reading. 

Across-Case Effect Size

In order to estimate an omnibus effect
size across all 5 participants, we calculated
individual Tau-U values for each participant
and calculated a weighted average. The significant
baseline trend (p < .15) was controlled
for Tomas. Across all 5 participants, PMIR
had a large effect on sight-word reading,
Tau-U = 0.91, 95% CI [0.69, 1.13]. This effect
was statistically significant (p < .001).

We also estimated the effect of PMIR on 
word reading using multilevel modeling. For 
Question 1, the dependent variable (i.e., num¬ 
ber of words read correctly out of 3) represents 
a proportion for which a two-level logistic 
model is recommended over a Poisson model 
(Moeyaert, Ferron, Beretvas, & Van den 
Noortgate, 2014). Consistent with our hypoth¬ 
esis, visual analysis indicated that PMIR re¬ 
sulted in a change in level and few trends 
emerged during baseline or intervention 
phases. Thus, we decided to model changes in 
level but not slope. Moreover, baseline obser¬ 
vations were stable and we did not correct for 
time trends. The resulting Level 1 model was 
as follows: 

$$\log\left(\frac{\varphi_{ij}}{1 - \varphi_{ij}}\right) = \beta_{0j} + \beta_{1j}\,\mathrm{Phase}_{ij} \tag{1}$$

in which $\varphi_{ij}$ indicates the expected proportion
of retained words within Session i for
Subject j. $\mathrm{Phase}_{ij}$ is a dummy-coded
variable equaling zero if Measurement Occasion
i nested within Participant j belongs to the
baseline, one otherwise. As a consequence, $\beta_{0j}$
and $\beta_{1j}$ indicate the predicted number of
retained words for Case j during the baseline
phase and the treatment phase, respectively. In
order to estimate the effects across cases, a
second level was added by allowing the
Level 1 coefficients from Equations 1 and 2
(i.e., $\beta_{0j}$ and $\beta_{1j}$) to vary:

$$\beta_{0j} = \theta_{00} + u_{0j}, \qquad \beta_{1j} = \theta_{10} + u_{1j}$$

with

$$\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2_{u_0} & \sigma_{u_0 u_1} \\ \sigma_{u_1 u_0} & \sigma^2_{u_1} \end{bmatrix}\right) \tag{2}$$


The residuals at the second level are as¬ 
sumed to be multivariate normally distributed. 

For Question 1, we were especially interested
in the estimate of $\theta_{00}$ and $\theta_{10}$,
indicating the overall average baseline level and
treatment level across participants. In addition,
the multilevel analysis allows estimation of
the variability of these estimates across cases
within the same study. These coefficients are
obtained by estimating the diagonal elements
of the covariance matrix in Equation 2. The
$\sigma^2_{u_0}$ and $\sigma^2_{u_1}$ reflect the between-case variance
in baseline level and treatment level,
respectively.

Participants recalled a low number of
words during the baseline phase. After back-transforming
from the log odds (−2.333), the
average number of retained words during
baseline was 0.265, t(4) = 2.17, p = .095. The
average number of retained words during the
intervention phase was significantly higher.
With application of the same back-transformation
to the resulting log odds (3.901), the
average number of retained words during the
intervention phase was 2.94, t(4) = 10.88, p <
.001. Thus, the across-case effect of PMIR on
the reading of words equals 2.675, t(4) = 2.15,
p < .001. Observed between-participant
variability in the effect of PMIR was low, 0.075
(SE = 0.166), and statistically significant.
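
The back-transformation can be reproduced with the inverse logit. The sketch below assumes the reported values are log odds of the proportion of the three words read correctly per session. The function name is hypothetical, and this is an arithmetic check, not the SAS code used in the study.

```python
import math

WORDS_PER_SESSION = 3  # each assessment covered 3 words


def expected_words(log_odds, n_words=WORDS_PER_SESSION):
    """Convert log odds to a proportion, then to a number of words."""
    proportion = 1 / (1 + math.exp(-log_odds))  # inverse logit
    return n_words * proportion


baseline = expected_words(-2.333)
intervention = expected_words(3.901)
print(round(baseline, 3))                 # 0.265
print(round(intervention, 2))             # 2.94
print(round(intervention - baseline, 3))  # 2.675
```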


Question 2: Effect of PMIR on ORF 

The second question examined whether 
the observed effect of PMIR would generalize 
to improvements in ORF (see Figure 2). We 
hypothesized that PMIR would have a de¬ 
layed, small effect on ORF. As discussed be¬ 
low, results of the visual analysis revealed that 
there was no functional relationship between 
PMIR and ORF. Therefore, we did not esti¬ 
mate across-case effect sizes. The percentiles 
presented for each student were calculated us¬ 
ing the AIMSweb national norms for the win¬ 
ter administration based on the student’s cur¬ 
rent grade level. 

Martin 

During baseline, Martin’s ORF (Mdn = 
25) was between the first and second percen¬ 
tiles for third-grade students. Martin’s ORF 
was relatively stable (80% of data within ± 
20% of phase median). There was a positive 
trend in his ORF during baseline. During 
PMIR, Martin’s ORF (Mdn = 33) was be¬ 
tween the second and third percentiles. Al¬ 
though his performance varied (i.e., 72% of 
data points within ± 20% of phase median), 
there was a small positive trend in his ORF 
performance. When Martin’s performance be¬ 
tween phases was examined, there was no 
immediate change in performance and there 
was substantial overlap in the data. There was 
a small positive change in level (2 WRCM). 
Taken together, visual analysis did not indi¬ 
cate that PMIR had an effect on Martin’s 
ORF. 

Lucia 

Lucia’s ORF during baseline (Mdn = 
59) was at the 11th percentile for third-grade 
students. There was a large absolute level 
change (23 WRCM) and a positive trend in her 
performance during baseline. Her perfor¬ 
mance during baseline was variable (i.e., 71% 
of data within ± 20% of phase median). Dur¬ 
ing PMIR, Lucia’s ORF (Mdn = 55) was 
between the 10th and 11th percentiles for 
third-grade students. Her performance was 
variable (i.e., 74% of data within ± 20% of 
phase median). The relative level change and 
absolute level change were positive and suggestive
of a small positive trend in her ORF
during PMIR. 

There was an immediate decrease in Lu¬ 
cia’s ORF after the implementation of PMIR. 
Relative, absolute, and median level changes 
reflected a negative change in ORF between 
phases. The positive trend in ORF during 
baseline decelerated after implementation of 
PMIR, and there was substantial overlap be¬ 
tween phases. Visual analysis did not indicate 
that PMIR had an effect on Lucia’s ORF. 

Iker 

Iker’s ORF during baseline (Mdn = 13)
was at the second percentile for second-grade 
students. His performance during baseline was 
varied (i.e., 56% of data within ± 20% of 
phase median). The relative and absolute level 
changes during baseline were positive, and 
there was a small, positive trend in perfor¬ 
mance. During PMIR, Iker’s ORF (Mdn = 23) 
was between the fourth and fifth percentiles 
for second-grade students. Iker’s performance 
was varied (i.e., 52% of data within ± 20% of 
phase median). The relative level change and 
absolute level change were positive, and there 
was a small positive trend in his ORF perfor¬ 
mance during PMIR. 

PMIR was associated with an immediate 
decrease in Iker’s ORF. However, this change 
was not consistent with positive relative, me¬ 
dian, and mean level changes suggesting im¬ 
provement from baseline to PMIR. The posi¬ 
tive trend in ORF during baseline accelerated 
after implementation of PMIR, but there was 
substantial overlap between phases. Visual 
analysis did not indicate that PMIR had an 
effect on Iker’s ORF. 

Sofia 

Sofia’s baseline ORF (Mdn = 24) was at 
the fifth percentile for second-grade students. 
Sofia’s performance during baseline was 
somewhat varied (i.e., 75% of data within ± 
20% of phase median). Estimates of trend 
were inconsistent with negative relative and 
absolute level changes observed but a small 
positive trend in her ORF slope. During 
PMIR, Sofia’s median ORF (Mdn = 28) was 
at the sixth percentile for second-grade students.
Sofia’s performance was varied (i.e.,
52% of data within ± 20% of phase median), 
and this variability appeared to increase 
throughout the PMIR phase. The relative level 
change (14 WRCM) and absolute level change 
(31 WRCM) were large, and there was a pos¬ 
itive trend in her ORF performance during 
PMIR. 

PMIR was associated with an immediate 
decrease in Sofia’s ORF (3 WRCM). Her rel¬ 
ative level change was negative, while the 
median and mean level changes were positive. 
The positive trend in ORF during baseline 
accelerated after implementation of PMIR. 
There was substantial overlap in the data be¬ 
tween phases. Visual analysis did not indicate 
that PMIR had an effect on Sofia’s ORF. 

Tomas 

During baseline, Tomas’ median ORF 
(Mdn = 56) was at the 18th percentile for 
second-grade students. Tomas’ performance 
during baseline was stable (i.e., 93% of data 
within ± 20% of phase median). Within-phase 
relative and absolute level changes were neg¬ 
ative, and his ORF performance decreased 
over time. During PMIR, Tomas’ ORF 
(Mdn = 54) was at the 16th percentile for 
second-grade students. His ORF during PMIR 
was stable (100% of data within ± 20% of 
phase median). There was a small, negative 
relative level change (−0.5 WRCM) and large
absolute level change (5 WRCM). His ORF 
slightly decreased during PMIR. 

There was a small, positive improve¬ 
ment in ORF between phases. This positive 
effect was not consistent, with small, positive 
changes in the relative and absolute level 
changes but negative changes in the median 
and mean levels. The decreasing trend in ORF 
during baseline decelerated after implementa¬ 
tion of PMIR but remained negative. There 
was substantial overlap between phases. Vi¬ 
sual analysis did not indicate that PMIR had 
an effect on ORF for Tomas. 

Overlap of Words Taught 

Because of the lack of effect on ORF, 
we examined the overlap between the un¬ 
known words taught and the words contained 
in the AIMSweb passages. While some unknown
words were common across students,
each word was only included once for these 
analyses. Iker, Sofia, and Tomas were as¬ 
sessed using second-grade passages. A total of 
132 of the 155 unknown words taught (81%) 
appeared at least once in an AIMSweb pas¬ 
sage. The median number of passages that 
included an unknown word was 5 (range = 1 
to 26), and the average number of appearances 
per page was 1.67 (range = 1 to 6.11). The 
unknown words appeared a total of 1,467 
times, which represents approximately 22.3% 
of the total words in the second-grade pas¬ 
sages. 

Lucia and Martin were assessed using 
third-grade passages. A total of 95 of the 142 
unknown words taught (67%) appeared at 
least once in an AIMSweb passage. The me¬ 
dian number of passages that included a given 
unknown word was 4 (range = 1 to 27). The 
average number of appearances per page 
was 1.50 (range = 1 to 7.33). The unknown 
words appeared a total of 1,016 times, which 
represents approximately 12.6% of the total 
words in the third-grade passages. Although 
we did not analyze the placement of the 
words in the passage to determine if the 
student reached the words, these data indi¬ 
cate that there was little overlap between the 
words taught and the words contained in the 
AIMSweb passages. 

Question 3: Maintenance of PMIR 
Effects 

The third research question pertained to 
the durability of any observed PMIR effects. 
Given the lack of a functional relationship 
between PMIR and ORF, we did not discuss 
maintenance data for ORF. We assessed main¬ 
tenance by examining the number of words the 
student read correctly from all of the words 
taught during PMIR. This assessment oc¬ 
curred 10 days after the final intervention ses¬ 
sion for all students, except Martin, who com¬ 
pleted the assessment 8 days after the final 
intervention session because of absences. 

Sofia (47 of 54; 87%) and Lucia (53 of 
69; 77%) were able to read more than 75% of
the words taught during PMIR sessions after
intervention completion. Martin (51 of 75;
68%) and Iker (42 of 63; 67%) correctly read
a smaller percentage of words taught during 
PMIR but still more than 50%. Overall, par¬ 
ticipants correctly read an average of 74.6% of 
the words taught (SD = 9.4%) during PMIR. 
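
The summary statistics above can be checked from the per-student counts. The sketch below assumes the reported SD is the sample standard deviation (n − 1 denominator) across the four students with maintenance data. The variable names are hypothetical.

```python
from statistics import mean, stdev

# (words read correctly, words taught) at the maintenance assessment
retained = {
    "Sofia": (47, 54),
    "Lucia": (53, 69),
    "Martin": (51, 75),
    "Iker": (42, 63),
}

percentages = [100 * read / taught for read, taught in retained.values()]
average = mean(percentages)
spread = stdev(percentages)  # sample SD, an assumption about the reported value

print(round(average, 1), round(spread, 1))  # 74.6 9.4
```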

Treatment Acceptability 

Treatment acceptability data were col¬ 
lected from all target students except for To¬ 
mas. The four target students reported that 
PMIR helped them learn new words (M = 9). 
The target students reported generally liking 
PMIR, although one participant reported not 
liking reading the ORF passages prior to each 
session. All target students reported they 
would participate in PMIR again to learn new 
words. Tutors reported that PMIR helped 
other students learn other words (M = 8.8). 
All tutors reported liking teaching the target 
students and reported that they would want to 
use PMIR to teach other students again. One 
tutor did comment on the perceived difficulty 
of the interspersal procedure, reporting that 
she felt like she made too many mistakes 
during PMIR. 

DISCUSSION 

Previous research has supported the use 
of IR for teaching words; however, the inten¬ 
sity of the intervention likely precludes its 
use in most situations. Methods to decrease 
the resource intensity of IR while maintaining 
the causal mechanisms may increase the rele¬ 
vance of IR for schools that serve large pop¬ 
ulations of ELL students. Using peer interven¬ 
tionists appears to be a promising modification 
of IR for ELL students, but this modification of 
IR has yet to be empirically investigated. To fill 
this gap in the literature, this efficacy study in¬ 
vestigated the effects of PMIR for ELL students. 

Results from this study indicated that 
there was a functional relationship between 
PMIR and word reading. Visual analyses sug¬ 
gested that PMIR had an effect on word read¬ 
ing for all 5 participants. Large across-case 
effect sizes were found using Tau-U and multilevel
modeling of raw data. The estimated
effect sizes were consistent across all 5 participants.
Results also suggested that the improvements
in word reading were maintained
over time. Approximately 10 days after the 
intervention, participants retained a high per¬ 
centage of taught words. 

The large effect of PMIR on word read¬ 
ing was generally consistent with the average 
treatment effect of IR implemented by adults 
found in 12 studies (Burns et al., 2012). In 
addition, the peer interventionists indicated 
that PMIR was easy to implement after some 
practice. These findings are promising because 
they suggest peers can deliver IR with a similar 
degree of effectiveness to adults, although we 
did not test that hypothesis here. Reducing the 
adult time necessary to deliver IR while main¬ 
taining the causal mechanisms of the interven¬ 
tion could increase the contextual validity of IR. 
Moreover, the use of PMIR is consistent with the 
Institute of Education Sciences recommenda¬ 
tions for providing structured instructional activ¬ 
ities for ELL students with peers of differing 
English proficiencies (Gersten et al., 2007). 

Contrary to our hypothesis, IR did not 
have an effect on participants’ ORF. Recent 
meta-analytic research indicated that IR was 
associated with a moderate to large effect on 
students’ ORF in three studies (Burns et al., 
2012). Multiple hypotheses may explain the 
lack of effect. First, we used grade-level pas¬ 
sages to assess ORF. Given students’ baseline 
performance (< 11th percentile), these pas¬ 
sages were likely at the frustration level. Us¬ 
ing instructional level passages may have been 
more sensitive to small changes in ORF per¬ 
formance (Shapiro, 2011). Second, we did not 
use IR to preteach words contained in the 
passages. In previous studies of IR, research¬ 
ers taught unknown words that were contained 
in the passage and then examined differences 
in ORF (e.g., Burns, 2007; Burns, Dean, &
Foley, 2004). There was little overlap between the
unknown words and the AIMSweb passages in
this study. Third, nine words being taught per 
week may have been too few to affect stu¬ 
dents’ ORF. We taught three words per ses¬ 
sion to avoid exceeding students’ acquisition 
rates. Exceeding a student’s acquisition rate can 
reduce learning and reading (Helman & Burns,
2008), a result that may have led us to conclude
that peers could not effectively deliver IR. Because
of the small number of words taught per
week, word reading fluency may have provided 
a better measure of generalization. 

Sight-word reading is an important pre¬ 
cursor to upper-level reading skills such as 
fluency and comprehension (National Institute 
of Child Health and Human Development, 
2000). ELL students with early deficits in 
word reading skills are likely to have later 
delays in word reading skills (Lesaux et al., 
2007). Improving word reading is an important 
intervention target for students with a limited 
sight-word vocabulary (Hudson et al., 2012; 
Vaughn et al., 2005). However, these improve¬ 
ments are unlikely to generalize to improve¬ 
ments in ORF without incorporating practice of 
connected text reading (Daly, Neugebauer, Chafouleas,
& Skinner, 2015). Future research could
examine the efficacy and efficiency of using 
peers to deliver IR within a complex intervention 
package that also targets fluency. 

Limitations and Future Directions for 
Research 

Results from this study must be inter¬ 
preted in the context of the limitations. First, 
the use of peer tutors is a promising approach 
for increasing the contextual validity of IR. 
Our findings suggest that peers can deliver IR 
with similar effectiveness to adults. Docu¬ 
menting the efficacy of PMIR was a necessary 
precursor to comparing PMIR with other in¬ 
terventions, but it is important to note that 
these data do not answer questions regarding 
the comparative effectiveness or the compar¬ 
ative efficiency of PMIR. Future research is 
necessary to examine the efficiency of PMIR 
compared with other interventions delivered 
by adults or other peers. 

Second, single-case designs have strong 
internal validity but replication is needed to 
establish an intervention as evidence based 
(Kratochwill et al., 2013). In this initial efficacy 
study, the effects of PMIR were replicated 
across 5 participants with small variability in the 
effects between participants. Additional research 
across participants and outcomes is needed be¬ 
fore PMIR can be considered evidence based. 
For example, future research could examine if
PMIR is effective for teaching words to ELL 
students who speak a primary language other 
than Spanish, if PMIR is effective for teaching 
skills other than word reading (e.g., math facts), 
or if PMIR can be used as a reciprocal peer 
tutoring intervention. 

Third, we used randomization to select 
intervention start points prior to the study. 
Incorporating randomization into single-case 
research may improve the credibility of the 
results (Kratochwill & Levin, 2010) but de¬ 
creases the flexibility found in the traditional 
response-guided approach. In this study, To¬ 
mas began PMIR while his baseline perfor¬ 
mance was variable. We started PMIR during 
the session that was randomly assigned, but a 
response-guided approach would have al¬ 
lowed us to extend the baseline until Tomas’ 
performance was stable. We prioritized the 
benefits of randomization because this was an 
efficacy trial of PMIR, but this approach did 
introduce ambiguity into the visual analysis. 

Fourth, we combined data from 5 par¬ 
ticipants in order to estimate the overall aver¬ 
age effect of PMIR. Previous research on the 
multilevel modeling of single-case data (Moeyaert
et al., 2014) indicated that our sample
size was sufficient for obtaining an unbiased 
and precise estimate of the fixed (i.e., treat¬ 
ment) effects, but caution is warranted when 
interpreting between-case variance estimates 
as they may be biased. More research on 
PMIR using MBDs would allow for more 
accurate estimates of variance components. 

CONCLUSION 

School psychologists are likely to be 
involved in supporting ELL students’ reading 
achievement. This requires evidence-based in¬ 
terventions, studied with ELL students di¬ 
rectly, that are contextually valid. PMIR had 
similar effects to previous studies of IR deliv¬ 
ered by adults. These results also provide ad¬ 
ditional support for using IR procedures with 
ELL students. Using peer interventionists has 
promise for maintaining the benefits of IR 
while increasing the contextual validity of the 
intervention. PMIR could also be used as a
structured learning activity for peers of different
English proficiency levels. More research
is needed before PMIR can be considered an 
evidence-based practice, but practitioners 
could consider its use for teaching high-fre¬ 
quency words to ELL students. As with any 
intervention, practitioners must monitor short¬ 
term and long-term outcomes carefully to en¬ 
sure PMIR is having the desired effects. 